Genetics in Medicine — Latest Matching Preprints

1

Not Forgotten: Patient Experiences with Genetic Variant Reclassifications

Gupta, P.; Park, M. S.; Kao, E. Y.; McEwen, A. E.; Kumar, R. D.; Horike-Pyne, M.; Fowler, D. M.; Starita, L. M.; Knerr, S.; Stergachis, A. B.

2026-05-17 genetic and genomic medicine 10.64898/2026.05.06.26352483 medRxiv

Top 0.1%

73.4%

Purpose: Genetic variant reclassification is increasingly common in clinical genomics, yet limited data describe how patients experience re-contact and variant reclassification in routine clinical care. Methods: We conducted semi-structured qualitative interviews with 20 adult patients who received a variant reclassification following routine clinical genetic testing. Interviews explored emotional responses, communication experiences, and perceived value of genetic testing. Data were analyzed using Template Analysis, a form of thematic analysis. Results: Three overarching themes were identified. Participants identified a need for improved communication of reclassified results, particularly with respect to timing, modality, and contextualization (Theme 1). Experiences with reclassification also shaped perceptions of the value of genetic testing, with most participants viewing testing as worthwhile despite its evolving nature (Theme 2). Finally, many participants interpreted reclassification as evidence of personalized and ongoing care, reinforcing trust in genetic testing and biomedical research (Theme 3). Participants generally preferred to be informed of reclassified results regardless of reclassification type, although the direction of reclassification influenced emotional responses and preferred modes of communication. Downgrades from variants of uncertain significance to benign or likely benign were widely viewed as meaningful by participants. Conclusion: Variant reclassification was experienced as a signal of personalized, ongoing care. Timely, contextualized, patient-centered re-contact practices may reduce uncertainty, strengthen trust, and help patients not feel forgotten.

2

Phenotype-Specific Recalibration of MAVE Data Enables Repurposing of BAP1 Functional Assays for Kury-Isidor Syndrome

Gupta, P.; Balton, E. V.; Tejura, M.; Kumar, R. D.; Snyder, M. W.; Stone, J.; Villani, R. M.; Peter, B. H.; Sirisak, C.; Ian, G. A.; Martha, H.-P.; Danny, M. E.; Jane, R.; Elisabeth, R. A.; Andrew, S. H.; Mark, W.; Undiagnosed Diseases Network (UDN); Kathleen, L. A.; Matthew, B. D.; Melissa, M. J.; Gail, J. P.; Katrina, D. M.; Elizabeth, B. E.; Fowler, D. M.; Starita, L. M.; McEwen, A. E.; Stergachis, A. B.

2026-05-21 genetic and genomic medicine 10.64898/2026.05.15.26352805 medRxiv

Top 0.1%

70.9%

Purpose Multiplexed assays of variant effect (MAVEs) are transforming clinical variant interpretation. However, many genes are associated with more than one disease, making it unclear whether functional data generated in one disease context may be directly applicable to another. For example, germline BAP1 missense variants are associated with both BAP1 tumor predisposition syndrome (BAP1-TPDS) and Kury-Isidor syndrome (KURIS), a rare neurodevelopmental disorder. Here, we demonstrate how phenotype-specific calibration of BAP1 MAVE data enables disease-specific variant classification. Methods Saturation genome editing (SGE) data for BAP1 were recalibrated using either BAP1-TPDS- or KURIS-associated missense variants as pathogenic controls. Functional evidence strength was quantified using the Odds of Pathogenicity (OddsPath) framework and mapped to ACMG/AMP PS3/BS3 criteria. Recalibrated functional evidence was integrated with standard clinical criteria for variant classification. A workshop was developed to teach phenotype-specific MAVE recalibration to clinicians and variant curators and evaluated for educational impact. Results Phenotype-specific recalibration using BAP1-TPDS and KURIS controls yielded OddsPath values consistent with PS3_Strong evidence in both contexts. Application of KURIS-specific recalibration enabled the diagnosis of KURIS in an individual with a previously uncertain BAP1 missense variant. The educational workshop enabled quantitatively improved understanding in applying functional evidence. Conclusion Phenotype-specific recalibration enables appropriately calibrated reuse of MAVE datasets across distinct disease contexts, increasing the clinical utility of MAVE datasets and the interpretability of variants in pleiotropic genes. This framework expands the diagnostic utility of existing functional datasets without requiring new experimental assays.
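
The OddsPath step named in the Methods can be illustrated with a minimal sketch. It assumes the commonly cited ClinGen formulation, OddsPath = [P2 x (1 - P1)] / [(1 - P2) x P1], where P1 is the proportion of pathogenic variants among all controls and P2 is the proportion among controls with an abnormal assay readout. The strength cutoffs are the widely quoted ClinGen values; neither the formula nor the cutoffs are stated in the abstract, and the function names and example inputs are hypothetical.

```python
# Sketch only: formula and cutoffs are assumptions taken from commonly cited
# ClinGen guidance, not from the abstract above.

def odds_path(prior: float, posterior: float) -> float:
    """OddsPath = [P2 * (1 - P1)] / [(1 - P2) * P1].

    prior (P1): proportion of pathogenic variants among all control variants.
    posterior (P2): proportion pathogenic among controls with the readout of interest.
    """
    if not (0.0 < prior < 1.0 and 0.0 < posterior < 1.0):
        raise ValueError("prior and posterior must lie strictly between 0 and 1")
    return (posterior * (1.0 - prior)) / ((1.0 - posterior) * prior)


# Cutoffs checked from strongest to weakest evidence.
PATHOGENIC_CUTOFFS = [
    (350.0, "PS3_VeryStrong"),
    (18.7, "PS3_Strong"),
    (4.3, "PS3_Moderate"),
    (2.1, "PS3_Supporting"),
]
BENIGN_CUTOFFS = [
    (0.053, "BS3_Strong"),
    (0.23, "BS3_Moderate"),
    (0.48, "BS3_Supporting"),
]


def evidence_strength(value: float) -> str:
    """Map an OddsPath value to an evidence label under the assumed cutoffs."""
    for cutoff, label in PATHOGENIC_CUTOFFS:
        if value > cutoff:
            return label
    for cutoff, label in BENIGN_CUTOFFS:
        if value < cutoff:
            return label
    return "Indeterminate"


# Hypothetical inputs: half the controls are pathogenic, and 95% of the
# controls with an abnormal readout are pathogenic.
example = odds_path(prior=0.5, posterior=0.95)
label = evidence_strength(example)
```

Under these assumptions, changing the pathogenic control set (BAP1-TPDS versus KURIS variants) changes P1 and P2, which is how the same assay scores could yield a different evidence strength for each disease.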

3

Selection of Genetic Conditions for Multi-State Genomic Newborn Screening in BEACONS-NBS

Gold, N. B.; Johnson, B. A.; Somanchi, H.; Minten, T.; Coury, S. A.; Blout Zawatsky, C.; Begtrup, A.; Butler, E.; Langley, K. G.; Zimmerman, R.; McLaughlin, H. M.; Ellefson, T.; Kern, A.; Rehm, H. L.; Bick, D.; Brenner, S. E.; Kasperaviciute, D.; Abraham, R. S.; Aksentijevich, I.; Babinski, M.; Billington, C. J.; Butte, M. J.; Canna, S. W.; Caron, M.; Chan, Y.-M.; Chandrakasan, S.; Chiang, S. C. C.; Delmonte, O. M.; Diller, L. R.; Downie, L.; Fleischer, J.; Fulton, A.; Ganetzky, R. D.; Gold, J.; Goldbach-Mansky, R.; Grunebaum, E.; Hale, R. C.; Hamosh, A.; Hildebrandt, F.; Holtz, A. M.; Jacobse

2026-03-25 genetic and genomic medicine 10.64898/2026.03.23.26349079 medRxiv

Top 0.1%

70.1%

Introduction: BEACONS-NBS (Building Evidence and Collaboration for GenOmics in Nationwide Newborn Screening) is the first research study to integrate whole genome sequencing into newborn screening (NBS) across multiple U.S. states and territorial public health laboratory programs (PHLPs). We developed a list of conditions for screening. Methods: We designed inclusion criteria and assembled an initial condition list from published resources. The list was revised by clinical experts, molecular geneticists, genetic counselors, PHLPs, rare disease advocacy organizations, the BEACONS-NBS Community Advisory Board, and project leadership from the National Institutes of Health. For each condition, we provided a rationale for early detection, diagnostic signs or biomarkers, and treatments or surveillance strategies. Results: The BEACONS-NBS condition list includes 777 conditions associated with 743 genes, one copy number variant, and two aneuploidies and is larger than those used in other genomic NBS research studies in the U.S. and United Kingdom. Most conditions are inborn errors of immunity (37.2%), inherited metabolic disorders (18.7%), or endocrine conditions (18.1%). Nearly all conditions (93.3%) can be confirmed using a non-genetic test. Discussion: BEACONS-NBS has established a condition list for implementation across multiple state and territorial PHLPs, enabling the prospective evaluation of feasibility of population-wide genomic NBS.

4

Stratified evaluation of blood RNA sequencing in a rare disease cohort

Duzenli, T.; Durmus, S.; Kaya, H. E.; Sevilgen, F. E.; Kayhan, G.; Cakir, T.; Ergun, M. A.

2026-05-28 genetic and genomic medicine 10.64898/2026.05.27.26353804 medRxiv

Top 0.1%

57.7%

Background: RNA sequencing (RNA-seq) is increasingly recognized as a complementary tool to DNA-based sequencing for improving the diagnostic yield in Mendelian disorders. However, how the diagnostic performance of RNA-seq varies across molecularly and phenotypically distinct patient subgroups remains poorly defined. This study aimed to evaluate and compare the diagnostic utility of RNA-seq across three stratified groups of patients with non-diagnostic exome sequencing. Methods: We performed RNA-seq on whole blood samples from 90 patients with suspected Mendelian disease in whom clinical exome or whole-exome sequencing had failed to establish a molecular diagnosis. Patients were prospectively stratified into three groups of 30: (i) patients with a candidate variant of uncertain significance (VUS) with predicted splicing impact (Group 1), (ii) patients with a specific clinical pre-diagnosis but no identified pathogenic variant (Group 2), and (iii) patients without a specific pre-diagnosis or candidate variant (Group 3). Aberrant splicing, gene expression outliers, and allele-specific expression were analyzed using multiple bioinformatic tools and compared against a GTEx-derived control cohort. Results: RNA-seq contributed to a molecular diagnosis in 29 of 88 evaluable patients (32.9%). Diagnostic yield differed substantially across groups: 82.8% (24/29) in Group 1, 6.9% (2/29) in Group 2, and 10% (3/30) in Group 3. In Group 1, RNA-seq enabled reclassification of candidate VUS through direct demonstration of aberrant splicing events. In Group 2, RNA-seq identified a somatic mosaic ACTB variant missed by exome sequencing and reclassified a previously deprioritized APPL1 VUS. 
In Group 3, a deep intronic pseudoexon-activating variant in IGBP1 was identified in two siblings with severe microcephaly, providing evidence for a candidate X-linked microcephaly gene, and a pathogenic variant in RNU4-2, a non-protein-coding gene not captured by standard exome sequencing, was detected in a patient with ReNU syndrome. Conclusions: RNA-seq has the highest diagnostic utility when applied to evaluate candidate splice variants identified by prior DNA testing but also provides independent diagnostic value in patients without candidate variants. The systematic comparison across stratified patient groups supports the integration of RNA-seq into clinical genomic workflows and highlights the need for standardized analytic frameworks.
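
The stratified yields reported in the Results can be recomputed from the stated counts. The short sketch below uses only the numbers in the abstract; the variable and function names are illustrative. The overall figure works out to roughly 32.95%, which the abstract reports as 32.9%.

```python
# (diagnosed, evaluable) per group, as reported in the abstract.
groups = {
    "Group 1": (24, 29),  # candidate VUS with predicted splicing impact
    "Group 2": (2, 29),   # clinical pre-diagnosis, no pathogenic variant
    "Group 3": (3, 30),   # no pre-diagnosis and no candidate variant
}


def yield_pct(diagnosed: int, evaluable: int) -> float:
    """Diagnostic yield as a percentage of evaluable patients."""
    return 100.0 * diagnosed / evaluable


per_group = {name: round(yield_pct(d, n), 1) for name, (d, n) in groups.items()}

total_diagnosed = sum(d for d, _ in groups.values())
total_evaluable = sum(n for _, n in groups.values())
overall = yield_pct(total_diagnosed, total_evaluable)

for name, pct in per_group.items():
    print(f"{name}: {pct}%")
print(f"Overall: {total_diagnosed}/{total_evaluable} = {overall:.2f}%")
```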

5

Measuring the Meaning of Genomic Results: Harmonization of the Metric for Case-Level Results in the CSER2 Consortium

Powell, B. C.; Amendola, L. M.; Bonini, K. E.; Crosslin, D.; Desrosiers-Battu, L.; Hiatt, S. M.; Hindorff, L.; Kenny, E. E.; Mavura, Y.; Muenzen Ferar, K. D.; Risch, N.; Roman, T.; Slavotinek, A.; Van Ziffle, J.; Bowling, K. M.

2026-06-01 genetic and genomic medicine 10.64898/2026.05.28.26354388 medRxiv

Top 0.1%

53.8%

Yield of reported results from genetic testing provides a proximal measure of clinical usefulness. While ACMG/AMP guidelines provide representations of uncertainty for individual genetic variant classification, additional factors are considered when determining whether results explain a patient's presentation. To standardize cross-consortium analysis, a working group of the Clinical Sequencing Evidence-Generating Research (CSER2) consortium iteratively identified factors used when contextualizing variant-level results to case-level interpretation (i.e., interpretation of an individual's genetic data with respect to the indication for testing). Sites independently categorized results; complex cases were discussed collaboratively, leading to revision of classification categories. Our metric incorporates factors beyond classification of reported variants. Analogous to variant-level results, "Definitive Positive" and "Probable Positive" represent certainty that results may be clinically explanatory. The category "Inconclusive" applies when results may or may not fully explain the patient presentation, with subdivision into multiple (non-exclusive) subcategories. Cases falling outside all of the other categories are considered "Negative". The overall diagnostic yield by this metric and use of categories for inconclusive results varied by CSER project, in part paralleling study design differences. This case-level categorization provides a meaningful assessment of diagnostic yield, and for inconclusive cases identifies potentially resolvable factors for case resolution.

6

Determinants of DNA-sequence-based Diagnostic Yield in the CSER Consortium

Mavura, Y.; Crosslin, D.; Ferar, K. D.; Lawlor, J. M.; Greally, J. M.; Hindorff, L.; Jarvik, G. P.; Kalla, S.; Koenig, B. A.; Kvale, M.; Kwok, P.-Y.; Norton, M.; Plon, S. E.; Powell, B. C.; Slavotinek, A.; Thompson, M. L.; Popejoy, A. B.; Kenny, E. E.; Risch, N.

2026-04-22 genetic and genomic medicine 10.64898/2026.04.20.26351140 medRxiv

Top 0.1%

52.9%

Purpose: Diagnostic yield from exome and genome sequencing varies widely across studies. It remains unclear how much of this variation reflects patient-level factors (e.g., sex, clinical features, race/ethnicity, genetic ancestry) versus site-level practices such as sequencing modality or variant interpretation workflows. We aimed to quantify the contributions of these factors to diagnostic outcomes across five U.S. clinical sequencing sites. Methods: We performed a cross-sectional analysis of 3,008 prenatal, neonatal, and pediatric cases from the NHGRI Clinical Sequencing Evidence-Generating Research (CSER) consortium (2017-2023). Clinical indications spanned neurodevelopmental, neurological, immunological, metabolic, craniofacial, skeletal, cardiac, prenatal, and oncologic presentations. Genetic ancestry was inferred from sequencing data, and variants were interpreted using ACMG/AMP guidelines to classify DNA-based diagnoses. Generalized linear mixed models were used to estimate associations between diagnostic yield and fixed effects (sex, prenatal status, isolated cancer, number of clinical indications, sequencing modality, race/ethnicity, and genetic ancestry), while modeling study site as a random effect to quantify between-site variation. Results: The overall diagnostic yield was 19.0%. Multiple clinical indications (OR=1.47, 95% CI 1.20-1.80, p<0.001) were associated with higher diagnostic yield, and male sex (OR=0.80, 95% CI 0.66-0.96, p=0.017) and prenatal status (OR=0.63, 95% CI 0.44-0.90, p=0.012) were associated with lower yield. Sequencing modality, race/ethnicity, genetic ancestry, and isolated cancer were not statistically significantly associated with diagnostic outcomes. A model without fixed effects attributed ~10% of variance in diagnostic yield to between-site differences. After adjusting for covariates, site-level variance decreased to 5.7%, indicating consistent variation across sites not explained by measured patient factors.
Conclusion: Across five sites, patient-level clinical features influenced diagnostic yield, but substantial site-level variation remained even after adjustment. Differences in variant interpretation or case-classification practices may contribute to this residual variability. Further efforts to increase consistency in exome- and genome-sequencing diagnostic workflows may help reduce inter-site differences.

7

Genome sequencing boosts diagnostic yield for the developmental and epileptic encephalopathies

Munro, J. E.; Thiyagarajah, H.; Bennett, M. F.; Chiu, A. T. G.; Schneider, A. L.; Bennett, C. A.; Lieffering, N.; Allan, T.; Witkowski, T.; Harris, R. V.; Reid, J.; Sikta, N.; Macdonald, S.; Coulter, L.; Dang, Y. L.; Kerkhof, J.; Sadikovic, B.; Perucca, P.; Berkovic, S. F.; Sengupta, S.; LaFlamme, C. W.; Mefford, H. C.; Bahlo, M.; Scheffer, I. E.; Hildebrand, M. S.

2026-04-28 genetic and genomic medicine 10.64898/2026.04.24.26351703 medRxiv

Top 0.1%

51.5%

Purpose: Although most developmental and epileptic encephalopathies (DEEs) have a monogenic aetiology, routine clinical genetic testing is negative for 50% of patients. We hypothesized that the diagnostic yield could be increased in a large cohort of individuals with unsolved DEEs by applying genome sequencing along with enhanced variant analyses outside of coding regions. Methods: We performed genome sequencing for 242 participants with DEEs negative on prior genetic testing. We interrogated single nucleotide variants (SNVs), indels, and structural variants in both established and candidate DEE genes. All variants of interest were reviewed, classified, and validated by a multidisciplinary team. Results: A molecular diagnosis was discovered for 36/242 (15%) participants. The pathogenic or likely pathogenic variants comprised 26 SNVs and indels within coding regions, 9 structural variants, and 5 SNVs and indels in introns or non-coding genes. Variants of uncertain significance were detected in a further 10/242 (4%) participants. Conclusion: Genetic diagnostic yield for individuals with unsolved DEEs improves with genome sequencing analysis. This increase reflects both the identification of structural and non-coding variants not detectable on exome or gene panel analysis, and the detection of variants in genes newly associated with DEEs.

8

A structure-aware framework for genomic variant interpretation in genetic skeletal disorders

Piticchio, S. G.; Hosseini, N.; Grigelioniene, G.; Orellana, L.

2026-03-17 genomics 10.64898/2026.03.15.711892 medRxiv

Top 0.1%

42.5%

Background: Genetic skeletal disorders (GSDs) comprise a heterogeneous group of rare, predominantly monogenic conditions that are increasingly diagnosed through high-throughput sequencing. While gene discovery has progressed rapidly, interpretation of pathogenic and uncertain variants remains a major bottleneck, in part because their functional consequences are determined at the protein structure level. However, a systematic assessment of structural knowledge across GSD-associated genes is currently lacking. Here, we present a comprehensive protein structure-centric analysis of 674 protein-coding genes implicated in GSDs. Methods: We integrated experimental structures, AlphaFold2 (AF2) models, multimeric states, protein-protein interactions, and ClinVar variant annotations. Results: We quantify experimental structural availability and sequence coverage, revealing that 37% of GSD proteins lack any experimental structure and that, among proteins with structures, sequence coverage is often incomplete. We show that AF2 models provide high-confidence structural information for a substantial subset of proteins lacking experimental data, but that model reliability strongly correlates with existing structural coverage. Analysis of multimeric assemblies and co-occurring partners demonstrates that many GSD proteins function as obligate multimers, highlighting the importance of interface-level interpretation of variants. Finally, mapping clinically annotated missense variants onto representative protein structures illustrates how structural context can inform the interpretation of pathogenic and uncertain variants, particularly at interaction interfaces. Conclusions: Together, this work provides a structure-aware reference framework for GSD genes, highlighting systematic gaps in current protein knowledge and demonstrating how integration of structural data can guide genomic variant interpretation.
Our observations support a broader principle of structural equivalence, whereby distinct variants converge on shared structural perturbations that explain clustering patterns and enable mechanistic interpretation of nearby variants of uncertain significance.

9

Is it time for a paradigm shift? Tailored online video education instead of pretest genetic counseling facilitates high genetic test uptake and informed choice for adults seeking cardiovascular genetic testing

Rivers, B.; Murray, B.; Applegate, C. D.; Tichnell, C.; Gordon, C.; McClellan, R.; Brown, E.; Nunez, K.; Barth, A. S.; Taylor, C. O.; Yanek, L. R.; Day, J.; James, C. A.

2026-06-01 genetic and genomic medicine 10.64898/2026.05.28.26354394 medRxiv

Top 0.1%

41.8%

Background: Pretest genetic counseling (GC) is recommended in conjunction with genetic testing (GT) for cardiovascular (CV) indications, yet access to CVGC is limited, leading to delayed GT. Posttest GC could increase GC and GT access but requires efficient pretest education that supports both informed GT decision-making and robust GT uptake. Methods: We developed four indication-tailored online CV genetics education videos and deployed them in a 3-arm randomized trial comparing pretest vs. posttest outpatient CVGC (RESEQUENCE-GC, NCT05422573). Participants were 1:1:1 randomized to pretest video education plus an optional (efficiency arm) or required (flipped arm) phone call with a genetic counselor and planned posttest CVGC or to standard pretest CVGC (SOC arm). Questionnaires administered at baseline and post-education included the CV Multidimensional Model of Informed Choice [MMIC] to quantify GT knowledge and informed GT choice. Results: 389/767 (50.7%) adults aged 18-80 (mean 51.2 ± 14.9 years) scheduling a first CVGC appointment consented to RESEQUENCE-GC and completed the baseline questionnaire. Efficiency arm participants (video education + optional phone call) were most likely to complete pretest education (134, 97.4% efficiency; 107, 85.6% flipped; 111, 87.4% SOC, p=0.0012) and elect GT (131, 95.6% efficiency; 105, 84.0% flipped; 107, 84.2% SOC, p=0.0036). Few (4, 2.9%) efficiency arm participants requested an optional pretest phone call. Most flipped arm participants (90, 84.1%) had no post-video questions, consistent with the 97-second [IQR: 65s-145s] median call duration. CV genetics knowledge was high post-education (median 8 [IQR 7,8]/8 MMIC items correct). Only video-based pretest education was associated with a significant increase in knowledge (p<0.0001). Nearly all participants made an informed GT choice with no difference between intervention (95.6%) and SOC (90.4%) arms (p=0.074).
Conclusions: Tailored, online video pretest education can enhance CV GT uptake, support informed GT decision-making, and be integrated into efficient pretest workflows, suggesting utility in scalable posttest CVGC.

10

Automated Versus Manual Reanalysis In Rare Disease Genomics

Kaschta, D.; Arriens, V.; Mueller, S.; Utermann-Thuesing, C.; Vater, I.; Caliebe, A.; Nagel, I.; Spielmann, M.

2026-05-19 genetic and genomic medicine 10.64898/2026.05.16.26352295 medRxiv

Top 0.1%

41.1%

Purpose. Periodic reanalysis of genome sequencing data can yield additional diagnoses as knowledge evolves, yet manual reanalysis is labour-intensive. We compared automated and manual reanalysis approaches in rare disease genomics. Methods. We reanalyzed 377 rare disease cases: 158 with pathogenic or likely pathogenic (P/LP) findings, 49 with variants of uncertain significance (VUS), and 170 with no findings. Manual reanalysis used the standard diagnostic workflow for all cases without prior P/LP diagnoses (219 cases). An automated pipeline using Talos was benchmarked on the 158 P/LP cases before application to the 219-case reanalysis cohort. The mean reanalysis interval was 660 days. Results. Manual reanalysis identified three additional P/LP cases and two newly classified as VUS, increasing P/LP cases from 158 (41.9%) to 161 (42.7%). Talos recovered all three P/LP findings but only identified one of the two new VUS findings. Benchmarking showed 80.0% singleton concordance and 75.2% (82.8% proband-only) trio concordance, with approximately three variants per case. Conclusion. Reanalysis at 1.8 years yields a modest but clinically meaningful gain. Automated reanalysis closely approximates manual performance while reducing hands-on effort, supporting scalable reanalysis in routine genomic care. Keywords: rare disease genomics, genome sequencing, automated reanalysis, variant prioritization, Talos, diagnostic yield
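
The cohort arithmetic in this abstract can be checked with a brief sketch. All inputs are the counts stated above; the variable names are illustrative, and converting 660 days to years with 365.25 days per year is an assumption about how the 1.8-year figure was derived.

```python
# Baseline case counts as reported in the abstract.
baseline = {"P/LP": 158, "VUS": 49, "no findings": 170}
total_cases = sum(baseline.values())

# Cases without a prior P/LP diagnosis form the reanalysis cohort.
reanalysis_cohort = baseline["VUS"] + baseline["no findings"]

new_plp = 3  # additional P/LP cases found by manual reanalysis
plp_before_pct = 100.0 * baseline["P/LP"] / total_cases
plp_after_pct = 100.0 * (baseline["P/LP"] + new_plp) / total_cases

# Share of the reanalysis cohort that gained a P/LP finding.
gain_in_cohort_pct = 100.0 * new_plp / reanalysis_cohort

# Assumption: 365.25 days per year.
interval_years = 660 / 365.25

print(f"Total cases: {total_cases}, reanalysis cohort: {reanalysis_cohort}")
print(f"P/LP before: {plp_before_pct:.1f}%, after: {plp_after_pct:.1f}%")
print(f"New P/LP within reanalysis cohort: {gain_in_cohort_pct:.1f}%")
print(f"Mean interval: {interval_years:.1f} years")
```

Expressed against the 219 reanalyzed cases rather than all 377, the three new P/LP findings correspond to a gain of about 1.4%.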

11

Berrylyzer-an Efficient, Traceable, and Lightweight Intelligent Agentic System for Prenatal Genetic Diagnosis

Meng, M.; Liu, L.; Du, Q.; Zhou, X.; Tian, Y.; Sun, K.; Li, N.; Zhang, P.; Lian, X.; Fan, N.; Zhu, N.; Li, S.; Mao, A.; Li, Y.; Zou, G.

2026-04-04 genetic and genomic medicine 10.64898/2026.04.02.26349929 medRxiv

Top 0.1%

40.7%

Background: Artificial intelligence (AI)-driven variant prioritization has demonstrated substantial utility in expediting genetic diagnosis by ranking the most likely causative variants. While a variety of tools have been developed, few address the unique clinical and technical constraints in prenatal genetic diagnosis. Methods: We introduce Berrylyzer, a novel, end-to-end variant prioritization system applied to prenatal diagnosis. Inspired by clinicians' reasoning process during variant interpretation, Berrylyzer applies a modular, stepwise scoring architecture that jointly integrates phenotypic and genomic evidence and delivers a ranked list of candidate variants, achieving high computational efficiency without compromising analytical rigor. Moreover, Berrylyzer natively supports both structured ontologies and free-text clinical narratives, enabling flexible integration into diverse clinical environments. Its performance was rigorously evaluated across two independent, real-world prenatal cohorts and benchmarked against three state-of-the-art methods: Xrare, Exomiser, and PhenIX. Results: Across the two datasets, Berrylyzer ranked 56.41% and 58.12% of diagnostic variants first, and achieved recall rates of 94.02% and 97.42% within the top 20, respectively. Berrylyzer outperformed Xrare (85.19% and 87.08%), Exomiser (84.90% and 85.98%), and PhenIX (82.05% and 88.93%). Stratified analysis consistently demonstrated superior performance across diverse disease categories, inheritance patterns, and analytical strategies. Notably, Berrylyzer exhibited robustness regardless of phenotype format, yielding comparable top 20 recall rates for free-text descriptions and standardized terminologies. Conclusion: Berrylyzer represents an accurate, interpretable, and computationally lightweight variant prioritization system for prenatal genetic diagnosis.
Its superior performance across heterogeneous diagnostic contexts makes it a practical solution for seamless integration into clinical pipelines, thereby advancing precision medicine in prenatal settings.
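
The top-1 and top-20 figures quoted in the Results are top-k recall values. A minimal sketch of that metric follows, assuming one known diagnostic variant per case and 1-based ranks. The function name and the example ranks are hypothetical and are not study data or part of Berrylyzer.

```python
from typing import Optional, Sequence


def top_k_recall(ranks: Sequence[Optional[int]], k: int) -> float:
    """Fraction of cases whose diagnostic variant is ranked within the top k.

    ranks holds the 1-based rank of the diagnostic variant for each case;
    None means the tool did not rank that variant at all.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not ranks:
        raise ValueError("ranks must contain at least one case")
    hits = sum(1 for rank in ranks if rank is not None and rank <= k)
    return hits / len(ranks)


# Hypothetical ranks for five cases (illustrative only).
example_ranks = [1, 3, 1, 25, None]
top1 = top_k_recall(example_ranks, k=1)
top20 = top_k_recall(example_ranks, k=20)
print(f"top-1 recall: {top1:.2f}, top-20 recall: {top20:.2f}")
```

Counting unranked variants as misses keeps the denominator equal to the number of cases, so tools that filter out the diagnostic variant are not rewarded for doing so.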

12

Genomic ascertainment of PALB2-related cancer predisposition

Stewart, D.; Kim, J.; Haley, J. S.; Li, J.; Sargen, M. R.; Hong, H. G.; Tischkowitz, M.; McReynolds, L. J.; Carey, D. J.

2026-04-04 genetic and genomic medicine 10.64898/2026.04.03.26349984 medRxiv

Top 0.1%

40.4%

PURPOSE To evaluate cancer risk, age-specific penetrance, and mortality associated with heterozygous pathogenic or likely pathogenic (P/LP) germline PALB2 variants identified through genomic ascertainment and to assess modification by family history of cancer. PATIENTS AND METHODS We conducted a case-control study in two large population-based adult cohorts: the UK Biobank (n=469,580) and Geisinger MyCode (n=167,050). Individuals with heterozygous PALB2 P/LP variants were identified via exome sequencing and compared with non-carriers. Cancer diagnoses and vital status were obtained from linked registry and electronic health record data. We used multivariable logistic regression to estimate odds ratios (ORs) for cancer outcomes and Cox proportional hazards models to estimate hazard ratios (HRs) for all-cause mortality. Age-specific cumulative incidence (penetrance) was estimated using Kaplan-Meier methods. Models were adjusted for birth year, sex (when applicable), smoking status, and body mass index; stratified analyses assessed modification by family history of cancer. RESULTS PALB2 P/LP variant prevalence was 1:571 in UK Biobank and 1:940 in MyCode, with the higher prevalence in the UK cohort driven by the PALB2 p.Trp1038Ter founder variant. Compared with non-carriers, heterozygotes had significantly increased odds of any cancer, female breast cancer, pancreatic cancer, and cancers of ill-defined or secondary sites in both cohorts (P < 0.01). Adjusted hazard ratios for any cancer and female breast cancer ranged from 1.7 to 3.6. All-cause mortality was increased among PALB2-heterozygotes (HR 1.61-1.67), and survival after cancer diagnosis was reduced. Family history further modified cancer risk. CONCLUSION Genomic ascertainment of PALB2-heterozygotes identifies elevated risk for multiple cancers and increased mortality, although risks were lower than estimates from familial ascertainment. 
These findings inform risk management for individuals identified through genomic screening.

13

Comprehensive analysis of de novo variants across 2,497 orofacial cleft trios reveals novel genetic drivers of disease

Kurtas, N. E.; Sanchis-Juan, A.; Shin, E.; Curtis, S. W.; Robinson, K. R.; Lee, A. S.; Alade, A. A.; Zhao, X.; Fu, J.; Diaz Perez, K. K.; Gowans, J. J. L.; Eshete, M. A.; Adeyemo, W. L.; Buxo, C. J.; Padilla, C. D.; Poletta, F. A.; Carreno Torres, A.; Wehby, G. L.; Hecht, J. T.; Moreno Uribe, L. M.; Mukhopadhyay, N.; Shaffer, J. R.; Weinberg, S. M.; Murray, J. C.; Beaty, T. H.; Butali, A.; Talkowski, M.; Marazita, M. L.; Leslie-Clarkson, E. J.; Brand, H.

2026-05-24 genetic and genomic medicine 10.64898/2026.05.21.26352934 medRxiv

Top 0.1%

38.4%

Background Orofacial clefts (OFCs) and other palate abnormalities (PAs) are among the most common birth defects worldwide and are characterized by the abnormal formation of the lip and/or palate. Genetic studies have traditionally classified OFC cases as either syndromic, involving OFCs alongside other congenital anomalies, or nonsyndromic, which represent the majority of cases and occur in isolation. Emerging genomic evidence indicates that genes traditionally associated with syndromic forms of OFC can also harbor variants contributing to isolated cases, challenging the notion of a strict dichotomy between these categories and supporting their integration for gene discovery. Methods In this study, we applied multiple analytic approaches to characterize the genetic architecture of OFC and PAs by integrating genomic data from 2,497 trios with an OFC (n=2080) and PA (n=417) affected proband. We compared these findings across OFC subtypes and syndromic status with those from 5,515 control trios to identify enriched biological pathways and mechanisms and to prioritize candidate genes using variant burden testing. Results We observed a significant enrichment of de novo protein-truncating and damaging missense variants in cases compared to controls (OR = 2.17, p = 1.21 × 10⁻³²), with particularly strong signals in biologically relevant gene sets involving OFC-associated, constrained, Mendelian disorder, and mouse candidate genes. Variant burden testing identified 39 OFC risk genes at FDR ≤ 0.05, which we then integrated with 593 established OFC genes to interrogate the functional underpinnings of OFC via network analysis. This analysis revealed 309 high-order interactor genes not previously associated with OFC. Notably, this OFC network clustered into ten distinct biological pathways, with nucleosome-associated genes showing significant enrichment among cases in our cohort (OR = 14.8, p = 8.1 × 10⁻⁴).
In a final integrative step, we combined evidence across all analyses to nominate 231 candidate genes, 32 of which contained at least two deleterious de novo variants in our cohort. Conclusions These findings underscore the value of integrating diverse OFC and PA subtypes, syndromic status, and variant classes to refine the genetic architecture of these disorders, highlighting both phenotypic expansion of known disease genes and the emergence of novel gene-phenotype associations.

14

Prevalence and Clinical Significance of Adult-Onset Cancer Predisposition Variants in Pediatric Oncology

Maciaszek, J. L.; Pastor Loyola, V.; Cain, T.; Cardenas, M.; Blackburn, P. R.; Wilkinson, M. R.; Koo, S. C.; Wu, C.-H.; Li, C.; Wang, L.; Nichols, K. E.; Klco, J. M.; Eldomery, M. K.

2026-06-08 genetic and genomic medicine 10.64898/2026.06.07.26354365 medRxiv

Top 0.1%

38.0%

Purpose: Pathogenic or likely pathogenic (P/LP) variants are increasingly identified in genes more commonly associated with adult-onset cancer predisposition, but their prevalence and relevance to children who present with cancer remain unclear. Methods: We retrospectively analyzed 1,280 consecutive pediatric patients with cancer who underwent clinical germline sequencing, using a virtual panel, from 2021 to 2024. Genes with P/LP variants were categorized as adult-onset cancer predisposition genes (aoCPG) or pediatric-onset cancer predisposition genes (poCPG) according to cancer risk before age 18 years and pediatric surveillance recommendations. Variant relevance was adjudicated using tumor diagnosis/histopathology, immunohistochemistry, and tumor molecular features and classified as primary, secondary, or indeterminate. Results: Among 1,280 patients, 197 (15.4%) harbored 211 P/LP variants across 54 genes. Sixty-six variants (31.3%) occurred in aoCPG, 87 (41.2%) in poCPG, and 58 (27.5%) were heterozygous variants in autosomal recessive genes. Among adult-onset variants, 7 (10.6%) were primary, 54 (81.8%) secondary, and 5 (7.6%) indeterminate. Among pediatric-onset variants, 77 (88.5%) were primary and 10 (11.5%) secondary. Six patients (3 adult-onset variants; 3 pediatric-onset variants) received targeted therapy informed by germline/somatic sequencing results. Conclusion: In pediatric oncology, most variants in aoCPG are secondary rather than tumor-related findings. Tumor-informed interpretation, beyond variant classification, may improve reporting, counseling, and therapeutic decision-making.

15

Advancing precision medicine in the Cardiac Intensive Care Unit using universal whole-genome sequencing

Kierulf, G.; Emmerson, M.; Krautscheid, P.; Bleyl, S.; Tristani-Firouzi, M.; Sawyer, B.

2026-05-14 genetic and genomic medicine 10.64898/2026.05.11.26352916 medRxiv

Top 0.1%

37.8%

Congenital heart defects (CHD) are a common congenital anomaly and a leading cause of neonatal mortality. Even in ostensibly isolated cases, genetic testing can reveal monogenic causes of isolated CHD or identify syndromic conditions before additional features become clinically apparent. A timely and accurate genetic diagnosis can inform medical management and surveillance, reduce the need for unnecessary investigations, and offer families valuable information about prognosis, recurrence risk, and anticipatory guidance. In September 2023, Primary Children's Hospital introduced a universal genetic testing protocol that implemented whole genome sequencing (WGS) for all neonates admitted to the cardiac intensive care unit (CICU) undergoing cardiac surgery before 30 days of life, with the goal of increasing the number of patients who receive a timely genetic diagnosis and improving clinical care. This is a retrospective chart review of patients who underwent WGS under the new universal genetic testing protocol at Primary Children's Hospital from its initiation in September 2023 to February 2026. Over the study period, 217 neonates with CHD participated in the universal WGS protocol. Of these patients, 23 (10.6%) received a genetic diagnosis that was causative of their CHD, of whom 11 (48%) had no major extracardiac features at the time testing was ordered. Twenty patients were diagnosed with a syndromic condition, and three patients were diagnosed with a non-syndromic condition. All of these patients received additional referrals to specialists following their new diagnosis, and six families used results to inform decisions regarding continuation of care. An additional 19 patients (8.8%) received WGS results that were clinically relevant but non-diagnostic for their CHD, including partial diagnoses, secondary findings, and carrier status. In total, 19.4% of patients (n=42) had clinically relevant variants identified on their WGS.

16

Artificial Intelligence-Based Chatbots in Genetic Counseling Practice: Current Uptake, Utilization, and Perspectives

Daley, N.; Griswold, A.; Moreno, L.; Floyd, A.; Duong, D.; Solomon, B. D.; Waikel, R. L.

2026-05-24 genetic and genomic medicine 10.64898/2026.05.21.26353789 medRxiv

Top 0.1%

35.5%

AI-driven chatbots have been utilized in healthcare to automate administrative tasks, improve patient education, and expand access to medical information; however, their role in genetic counseling remains underexplored. To investigate the adoption, perceptions, and potential utility of AI-based chatbots in genetic counseling practice, 217 genetic counselors and genetic counseling students from across North America were surveyed regarding chatbot usage, confidence in their application, and perceived benefits and limitations. While most participants (166/217; 76.5%) reported using general AI chatbots outside of clinical settings, far fewer (18/204; 8.8%) reported using or recommending clinical genetics chatbots in clinical practice. Among those who used clinical genetics chatbots, the primary purposes were communication with at-risk family members (11/18; 61.1%) and patient education (10/18; 55.6%). Confidence in chatbot technology varied, with highest confidence in gathering family history information (81/199; 40.7%) and lowest confidence in their ability to disclose variants of uncertain significance or positive genetic testing results (5/199; 2.5%). The greatest perceived benefits included reducing repetitive tasks (165/195; 84.6%) and allowing time for other tasks (141/195; 72.3%), while major concerns revolved around patient comprehension (167/195; 85.6%) and having accurate, up-to-date information (145/195; 74.4%). Despite some concern about AI replacing human counselors, most participants reported they felt there was potential for chatbots to enhance workflow efficiency (128/195; 65.6%) if properly integrated and regulated. Limited AI training was identified as a barrier to adoption (16/195; 8.2% received training), highlighting a need for structured education on AI applications in genetic counseling.
These findings suggest that AI chatbots hold promise as supplementary tools, but significant challenges must be addressed before widespread implementation in genetic counseling practice.

17

Diagnostic Accuracy of Large Language Models for Rare Diseases: A Systematic Review and Meta-Analysis

Nguyen, M.-H.; Yang, C.-T.; Cassini, T. A.; Ma, F.; Hamid, R.; Bastarache, L.; Peterson, J. F.; Xu, H.; Li, L.; Ma, S.; Shyr, C.

2026-03-27 genetic and genomic medicine 10.64898/2026.03.26.26349194 medRxiv

Top 0.1%

35.2%

Background: Large language models (LLMs) have been evaluated as tools to assist rare disease diagnosis, yet evidence on their accuracy remains fragmented. We conducted a systematic review and meta-analysis to synthesize the available evidence on the diagnostic performance of LLMs, identify sources of heterogeneity, and evaluate the current evidence base for clinical translation. Methods: We searched PubMed, Embase, Web of Science, Cochrane Library, arXiv, and medRxiv (January 2020-February 2026). Full-text articles and preprints were considered for inclusion. Eligible studies applied LLM-based systems to generate differential diagnoses for rare diseases and provided Recall@1 (R@1; proportion with the correct diagnosis ranked first). We pooled R@1 using Freeman-Tukey double arcsine transformation with DerSimonian-Laird random-effects models. Pre-specified subgroup analyses examined LLM knowledge augmentation strategy and input modality. Because both retained high residual heterogeneity, we conducted a post-hoc exploratory analysis of evaluation benchmark disease composition, mapping diseases from major benchmarks to Orphanet prevalence classifications. Risk of bias was assessed using a modified QUADAS-3 instrument. Findings: We identified 902 records, of which 564 were screened and 15 studies were eligible. These 15 studies contributed 19 system-dataset entries to the meta-analysis (total N=39,529 cases). The pooled R@1 was 43.3% (95% CI 35.1-51.6; I2=99.6%). Augmented LLM systems (agent-based reasoning, retrieval, or fine-tuning; k=8) achieved R@1 of 52.5% (42.0-62.9) versus 35.4% (30.6-40.4) for standalone LLMs (k=11; p=0.004). 
Post-hoc exploratory analysis indicated that evaluation benchmark disease composition was associated with differences in diagnostic performance: R@1 was 21.7% (18.2-25.5) on the Phenopacket Store dataset (k=2), in which 52.8% of diseases were ultra-rare, versus 52.0% (40.7-63.2) on RareBench (k=6), in which 29.3% were ultra-rare (p<0.001). All 19 system-dataset entries were assessed to be at high risk of bias, most commonly due to potential data leakage and limited reproducibility. No study provided prospective clinical validation. Interpretation: Diagnostic performance of LLM-based systems for rare diseases varied substantially across evaluation benchmarks. Post-hoc exploratory analysis indicated that performance was associated with benchmark disease composition. Performance was higher in benchmarks containing fewer ultra-rare diseases and in systems incorporating external knowledge at inference time. However, all included studies were at high risk of bias, and none reported prospective clinical validation. These findings highlight the need for prevalence-stratified evaluation benchmarks and independent prospective studies before clinical deployment. Funding: This work was supported in part by the National Institutes of Health Common Fund, grant 15-HG-0130 from the National Human Genome Research Institute, U01NS134349 from the National Institute of Neurological Disorders and Stroke, R00LM014429 from the National Library of Medicine, and the Potocsnak Center for Undiagnosed and Rare Disorders.
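
The pooling approach named in this abstract (Freeman-Tukey double arcsine transformation followed by a DerSimonian-Laird random-effects model) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names `freeman_tukey` and `pool_recall` are hypothetical, the input counts in the usage note are made up, and the back-transformation uses the simple approximation sin^2(t/2) rather than an exact inversion, so results near 0 or 1 may differ from dedicated meta-analysis software.

```python
import math


def freeman_tukey(events, n):
    """Return the double arcsine transform of events/n and its approximate variance."""
    t = math.asin(math.sqrt(events / (n + 1))) + math.asin(math.sqrt((events + 1) / (n + 1)))
    return t, 1.0 / (n + 0.5)


def pool_recall(entries):
    """Pool (correct_at_rank_1, total_cases) pairs with a DerSimonian-Laird model.

    Hypothetical helper for illustration only.
    """
    transformed = [freeman_tukey(x, n) for x, n in entries]
    ts = [t for t, _ in transformed]
    vs = [v for _, v in transformed]
    k = len(ts)

    # Inverse-variance (fixed-effect) weights and Cochran's Q.
    w = [1.0 / v for v in vs]
    sum_w = sum(w)
    fixed = sum(wi * ti for wi, ti in zip(w, ts)) / sum_w
    q = sum(wi * (ti - fixed) ** 2 for wi, ti in zip(w, ts))

    # DerSimonian-Laird moment estimate of the between-study variance, plus I^2.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0

    # Random-effects weights, pooled estimate, and standard error (transformed scale).
    w_re = [1.0 / (v + tau2) for v in vs]
    pooled_t = sum(wi * ti for wi, ti in zip(w_re, ts)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def back(t):
        # Approximate inverse transform; clamp so the mapping stays monotone.
        t = min(max(t, 0.0), math.pi)
        return math.sin(t / 2.0) ** 2

    return {
        "pooled": back(pooled_t),
        "ci_low": back(pooled_t - 1.96 * se),
        "ci_high": back(pooled_t + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }
```

For example, `pool_recall([(10, 100), (90, 100)])` pools two made-up, strongly discordant entries; the large between-entry variance shows up as a positive `tau2` and an `i2` close to 1, the same kind of heterogeneity the abstract reports.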

18

Sequencing for a Lifetime: Value, Feasibility, and the Governance Gap in Lifelong Genomic Medicine

Lewis, A. C. F.; Holm, I. A.; Buchanan, A. H.; Goldenberg, A. J.; Knoppers, B. M.; McGuire, A. L.; Green, R. C.

2026-04-30 genetic and genomic medicine 10.64898/2026.04.29.26352046 medRxiv

Top 0.1%

34.3%

Background: A vision of lifelong genomic medicine, in which stored genomic data can inform a lifetime of care, has long animated the field of genomic medicine. Component pieces of this vision are being researched or are already in clinical practice, including dozens of projects around the world sequencing healthy newborns, along with reanalysis of stored genomic data. Whether lifelong genomic medicine is desirable, and, if so, whether it is feasible, has not been explored in the literature. Methods and Findings: We conducted and thematically analyzed interviews with over 50 US-based healthcare professionals, including clinical geneticists, genetic counselors, primary care clinicians, laboratory personnel, and those who have implemented genomic screening in health systems. We found broad endorsement of the value of lifelong genomic medicine across groups. Perceived clinical value stemmed from the existence of genomic information relevant at multiple stages of life, the ability to query the genome if an individual's medical circumstances change, and the ability to inform patients about relevant evolving scientific advances. Participants also articulated an efficiency argument for reanalyzing stored genomic data rather than retesting. The clinical value was contested by a few participants, who argued for more targeted testing for the clinical situation and disputed the efficiency argument. Many participants viewed the model as inevitable, with operational precedent already established for many component activities. The feasibility of lifelong genomic medicine was limited not by scientific barriers but by governance gaps spanning delivery models, consent, data stewardship, recontact, and the pediatric-to-adult transition. These gaps have equity implications that are cumulative and mutually reinforcing. Conclusions: The concept of lifelong genomic medicine was widely viewed as acceptable and desired.
However, until the governance infrastructure is established, including accountability, funding, data stewardship, and recontact mechanisms, population-scale genomic sequencing risks proceeding faster than the frameworks needed to make it responsible.

19

Investigating Uptake and Impact of Genetic and Genomic Evaluation Following a Perinatal Demise

Mossler, K.; D'Orazio, E.; Hall, K.; Osann, K.; Kimonis, V.; Quintero-Rivera, F.

2026-04-23 genetic and genomic medicine 10.64898/2026.04.22.26347546 medRxiv

Top 0.1%

33.6%

Objective: The decline of the perinatal demise rate is slowing, and demises are often unexplained. Significant research has been done regarding diagnostic yield and genetic causes of demise, but little is known about how geneticist involvement impacts outcomes. The goal of the study was to evaluate post-mortem genetic testing practices and the effects of geneticist involvement. Methods: Retrospective data from 111 perinatal demise cases were examined, including rates of prenatal genetic counseling, post-delivery genetics consult, genetic testing, and autopsy investigation. Results: In this cohort, 54% received genetic testing and 25% received a genetics consultation. Compared with cases without it, cases with genetic specialist involvement were associated with significant increases in testing uptake (p=0.007), diagnostic yield (p<0.001), and patient education (p<0.001). Second trimester stillbirths and those with fewer ultrasound (US) abnormalities were less likely to receive genetic testing (both p values <0.001) and consults (p<0.001, p=0.020). Conclusion: Although ascertainment bias cannot be ruled out, these data demonstrate that geneticist involvement correlates with a higher rate of testing, greater diagnostic yield, and more thorough counseling. These findings underscore the importance of integrating genetics providers into perinatal postmortem healthcare teams.
What is already known about this topic?
- Causes of perinatal demise are often undiagnosed, but genetic and congenital anomalies are common.
- ACOG recommends genetic testing for all perinatal demises.
What does this study add?
- Genetic testing is under-offered and should be offered more frequently.
- Genetic specialist involvement is associated with increased patient education, genetic testing uptake, and diagnostic yield.
- Time and access to genetic specialists may drive testing rate.
- Non-English language may be associated with decreased consultation rate.

20

SafeGene: A Novel Computational Platform for Predictive Genetic Screening of Offspring Disease Risk Using Region-Specific Population Genetics, Mendelian Inheritance Models, and Consanguinity Coefficient Analysis in Saudi Arabia and the Gulf Cooperation Council States

Ahmed, A. K.; Rodaini, S.

2026-03-30 genetic and genomic medicine 10.64898/2026.03.28.26349627 medRxiv

Top 0.1%

33.4%

Background: Saudi Arabia bears a disproportionate burden of autosomal recessive genetic disorders, driven by consanguineous marriage rates of 50 to 58% and elevated carrier frequencies for conditions such as sickle cell disease (carrier rate up to 25%), beta-thalassemia (12%), and spinal muscular atrophy (6%). The existing premarital screening program screens for only two conditions. We developed SafeGene, a computational platform that expands predictive genetic screening to 50+ conditions using region-specific population genetics. Methods: SafeGene integrates five risk calculation engines: (1) Mendelian inheritance models for autosomal recessive (AR), autosomal dominant (AD), X-linked recessive (XR), and X-linked dominant (XD) conditions; (2) Hardy-Weinberg equilibrium-based carrier probability estimation using Saudi, Gulf, and global databases; (3) a six-level consanguinity coefficient calculator (F = 0 to 1/8) with risk amplification multipliers; (4) multifactorial polygenic risk models for 12 complex diseases; and (5) maternal-age-dependent trisomy risk curves. The platform was built using React.js, Node.js/Express, and MongoDB, with bilingual Arabic/English support. Results: The platform encompasses 50 genetic conditions across 12 categories. Validation against published Saudi data demonstrated concordance with observed disease frequencies. Economic modeling projects that expanding screening could prevent 2,800 to 4,200 affected births annually, yielding savings of SAR 1.2 to 2.8 billion ($320 to 746 million USD) per year. Conclusions: SafeGene represents a scalable, evidence-based digital health solution for comprehensive genetic screening addressing the unique population genetics of consanguineous Gulf societies. The platform is protected under pending patent applications in South Africa (CIPC) and Saudi Arabia (SAIP).
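
The kind of calculation described in engines (1) to (3) can be sketched with textbook population genetics. The following is not SafeGene's implementation, and the abstract does not specify its "risk amplification multipliers"; the sketch instead assumes the standard relations: carrier frequency 2q(1-q) under Hardy-Weinberg equilibrium, affected probability q^2 + F·q(1-q) for a child with inbreeding coefficient F, and a 1/4 transmission risk when both parents are carriers. All function names are hypothetical.

```python
import math


def allele_frequency_from_carrier_rate(carrier_rate):
    """Solve 2*q*(1-q) = carrier_rate for the smaller root q (Hardy-Weinberg assumption)."""
    if not 0.0 <= carrier_rate <= 0.5:
        raise ValueError("carrier_rate must lie in [0, 0.5] under Hardy-Weinberg")
    return (1.0 - math.sqrt(1.0 - 2.0 * carrier_rate)) / 2.0


def affected_probability(q, inbreeding_f=0.0):
    """Probability that a child is homozygous for the disease allele.

    Textbook formula q^2 + F*q*(1-q); F = 0 corresponds to unrelated parents,
    F = 1/16 to the child of first cousins.
    """
    if not 0.0 <= inbreeding_f <= 1.0:
        raise ValueError("inbreeding_f must lie in [0, 1]")
    return q * q + inbreeding_f * q * (1.0 - q)


def recessive_offspring_risk(p_mother_carrier, p_father_carrier):
    """Mendelian autosomal recessive risk: 1/4 when both parents are carriers.

    Ignores the possibility that a parent is affected (homozygous).
    """
    return 0.25 * p_mother_carrier * p_father_carrier
```

As a usage example, `affected_probability(allele_frequency_from_carrier_rate(0.25), 1/16)` applies the abstract's upper-bound sickle cell carrier rate to a first-cousin union; comparing it with the same call at `inbreeding_f=0.0` shows how consanguinity raises the recessive risk under these assumptions.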